System and method for a direct computation of cubic spline interpolation for real-time image codec

ABSTRACT

A cubic spline interpolation (CSI) method and apparatus for image compression by direct computation. Such a CSI method is used along with the JPEG standard to obtain a new CSI-JPEG encoder-decoder (Codec). In one embodiment, the present invention is a method and apparatus for a new CSI-JPEG codec enabling a pipelined compression method that is naturally suitable for hardware or firmware implementations.

CROSS REFERENCE TO RELATED APPLICATIONS

This patent application claims the benefit of the filing date of U.S.Provisional Patent Application Ser. No. 60/478,381, filed Jun. 12, 2003,and entitled “System and Method for A Direct Computation of Cubic SplineInterpolation for Real-Time Image Codec,” the entire contents of whichare hereby expressly incorporated by reference.

FIELD OF THE INVENTION

This invention relates to data compression. More specifically, theinvention relates to a new cubic-spline interpolation (CSI) for both 1-Dand 2-D signals to sub-sample signal and image compression data.

BACKGROUND OF THE INVENTION

In T. K. Truong, L. J. Wang, I. S. Reed, and W. S. Hsieh, “Image datacompression using cubic convolution spline interpolation,” IEEE Trans.on Image Processing, vol. 9, no. 11, pp. 1988-1995, November 2000 [1];and L. J. Wang, W. S. Hsieh, T. K. Truong, I. S. Reed, and T. C. Cheng,“A fast efficient computation of cubic-spline interpolation in imagecodec,” IEEE Trans. on Signal Processing, vol. 49, no. 6, pp. 1189-1197,June 2001 [2], the entire contents of which are hereby expresslyincorporated by reference, a cubic spline interpolation (CSI) isdeveloped in order to subsample image data to achieve compression. TheCSI scheme is combined with the JPEG algorithm to develop a modifiedJPEG encoder-decoder, which obtains a higher compression ratio and abetter quality of reconstructed image than the standard JPEG In the CSIalgorithm developed in [1], a fast Fourier transform (FFT) algorithmused in the modified JPEG encoder, is applied to perform the circularconvolution needed to compress and reconstruct image data.

Recently, the authors in [2] showed that if the size of compressed imageis not chosen to be power of two, the usual 2-D FFT is not the bestalgorithm needed to obtain the compressed image values. To overcome thisproblem, the authors proposed the Winograd discrete Fourier transform(WDFT) with the overlap-save method instead of the FFT to implement theCSI scheme. The disadvantage of this faster CSI algorithm is theoverlap-save method with its required boundary conditions. Thus thisalgorithm though faster is not readily realized as a real-timeprocessor.

Therefore, there is a need for a method and apparatus for a faster andmore efficient computation of a CSI for image signals.

SUMMARY OF THE INVENTION

In one embodiment, the present invention is a method performed by acomputer for encoding a signal including defining a 1-D cubic-splinefilter by $\begin{matrix}{{R(t)} = \left\{ {\begin{matrix}{{{\left( {3/2} \right){t}^{3}} - {\left( {5/2} \right){t}^{2}} + 1},} & {0 \leq {t} < 1} \\{{{{- \left( {1/2} \right)}{t}^{3}} + {\left( {5/2} \right){t}^{2}} - {4{t}} + 2},} & {1 \leq {t} < 2} \\{0,} & {2 \leq {t}}\end{matrix};} \right.} & (1)\end{matrix}$

-   -   applying the filter to an input signal x(t) with $\begin{matrix}        \begin{matrix}        {{y_{j} = {\sum\limits_{t = {{{- 2}\quad\tau} + 1}}^{{2\quad\tau} + 1}{{r(t)}\quad{x\left( {t + {j\quad\tau}} \right)}}}},} & \quad & {{0 \leq j \leq {n - 1}},}        \end{matrix} & (3)        \end{matrix}$    -    to compute y_(j);    -   computing B=[b₀,b₁, . . . ,b_(n−1)]_(C), where B denotes a        cyclic matrix of size n×n, where $\begin{matrix}        \begin{matrix}        {b_{k} = {\sum\limits_{t = {{{- 2}\quad\tau} + 1}}^{{2\quad\tau} + 1}{{r(t)}\quad{{r\left( {t + {k\quad\tau}} \right)}.}}}} & \quad & \quad & {0 \leq k \leq {n - 1}}        \end{matrix} & (4)        \end{matrix}$    -    and where b₀=α=1.641, b₁=b_(n−1)=β=0.246, b₂=b_(n−2)=γ=−0.07,        b₃=b_(n−3)=δ=0.004, b₄=0, b₅=0, . . . , b_(n−4)=0;    -   computing A=B⁻¹, where A is a circular matrix of size n×n, where        A=[a ₀ ,a ₁ ,a ₂ ,a ₃ ,a ₄ ,a ₅ ,a ₆ , . . . ,a _(n−6) ,a _(n−5)        ,a _(n−4) ,a _(n−3) ,a _(n−2) ,a _(n−1)]_(C)  (6)    -    and where a₀=0.646, a₁=a_(n−1)=−0.109, a₂=a_(n−2)=0.0467,        a₃=a_(n−3)=−0.014, a₄=a_(n−4)=0.0046, a₅=a_(n−5)=−0.00148, and        a₆≅a₇≅a₈≅ . . . ≅a_(n−6)≅0; and    -   computing        X=B ⁻¹ Y=AY  (7)    -    by computing $\begin{matrix}        {x_{j} = {\sum\limits_{k = 0}^{n - 1}{y_{k}\quad{a_{{({j - k})}_{n}}.}}}} & (8)        \end{matrix}$

BRIEF DESCRIPTION OF THE DRAWINGS

The features of this invention will become more apparent from aconsideration of the following detailed description and the drawings, inwhich:

FIG. 1 is an exemplary One-dimensional (1-D) cubic convolutioninterpolation function;

FIG. 2 is an exemplary 1-D compression filter, according to oneembodiment of the present invention;

FIG. 3 is an exemplary JPEG encode/decoder, according to one embodimentof the present invention; and

FIGS. 4(a)-4(d) are exemplary reconstructed images.

DETAIL DESCRIPTION

A cubic spline interpolation (CSI) for 2-D signals is performed by adirect computation in order to encode and decode compression for imagecoding. A pipeline structure in an electronic system or an integratedchip (IC) can be used to implement this new CSI. Such a new CSI methodcan be used along with the JPEG standard to obtain a new CSI-JPEGencoder-decoder (Codec) while still maintaining a good quality of thereconstructed image using higher compression ratios. In one embodiment,the present invention is a method and apparatus for a new CSI-JPEG codecenabling a pipelined compression method that is naturally suitable forhardware or firmware implementations.

In one aspect, the present invention is a method performed by a computerfor a direct computation of a cubic spline interpolation (CSI) for imagesignals. In another aspect, the present invention is an integrated chip(IC) that is configured to perform the above method. In yet anotheraspect, the present invention is a digital signal processor (DSP) thatis configured to perform the above method.

Since a large number of zeros exists in the interval occupied by thefilter coefficients of the CSI scheme (see, [1]), it is shown here thata direct computation, instead of using the FFT or WDFT algorithm, isdeveloped for computing the required circular convolution for any sizeof image. This new algorithm is utilized to aid in the JPEG standard toobtain a new JPEG codec. The advantage of this new CSI procedure overall other CSI methods (see, [1], and [2]) is that it can be implementedby a pipeline structure and is naturally suitable for very large scaleintegration (VLSI) implementation. The comparison of the operations ofthis new algorithm, the CSI algorithm, and the WDFT CSI algorithm showsthat the WDFT CSI algorithm requires fewer multiplications than both thenew CSI and the conventional CSI algorithms. However, the new type ofCSI algorithm requires less additions than both the WDFT CSI and theconventional CSI algorithms. Finally, computer runs show that for someimages of size 640×480, for example, the computation time of this newCSI-JPEG encoder that is implemented by a direct computation requiresonly 1.28 sec compared with 1.52 sec for the typical CSI-JPEG encoder of[1] and 1.11 sec for the WDFT CSI-JPEG encoder of [2] with almost thesame PSNR for the reconstructed image. That is, the new CSI-JPEG encoderrequires 0.24 sec less time than the typical CSI-JPEG encoder and 0.17sec less time than the WDFT CSI-JPEG encoder. In one embodiment, apipeline structure can be developed to implement the new CSI-JPEGencoder. As a result, the new CSI-JPEG encoder described here is easierto implement in hardware or firmware than the previous compressionalgorithms.

Direct Computation of the CSI Encoder for 2-D Image Signal

It was shown in [1] that the idea of the CSI scheme is to recalculatethe sampled values of the image data by means of the least-squaresmethod that uses of cubic convolution interpolation (CCI) formula. Inthis section it is shown that a direct computation can be utilized for2-D image data.

To illustrate the encoding algorithm of the CSI method developed in [1]and [2], let τ be a fixed positive integer. Also, let x(t) be periodicwith period N=nτ, where n is a positive integer. From S. Hou and H. C.Andrews, “Cubic splines for images interpolation and digital filtering,”IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-29, pp.1153-1160, December 1981[3], the entire content of which is herebyexpressly incorporated by reference, the 1-D CCI function r(t) shown inFIG. 1, is defined by $\begin{matrix}{{r(t)} = {\begin{Bmatrix}{{{\left( {3/2} \right){t}^{3}} - {\left( {5/2} \right){t}^{2}} + 1},} & {0 \leq {t} < 1} \\{{{{- \left( {1/2} \right)}{t}^{3}} + {\left( {5/2} \right){t}^{2}} - {4{t}} + 2},} & {1 \leq {t} < 2} \\{0,} & {2 \leq {t}}\end{Bmatrix}.}} & (1)\end{matrix}$

One defines the k-th shift function of the CCI function r(t) asψ_(k)(t)=r(t−kτ) for 0≦k≦n−1, where r(t) is assumed to be a periodicfunction of period N. Also, let x_(i) for 0≦i≦n−1 be the compressedvalues (coefficients) at the sampling points which represent thecompressed data to be transmitted or stored. It follows from [1] thatthe set of equations as the circular convolution is given by$\begin{matrix}\begin{matrix}{{y_{j} = {\sum\limits_{k = 0}^{n - 1}{x_{k}b_{{({j - k})}_{n}}}}},} & \quad & {0 \leq j \leq {n - 1}}\end{matrix} & (2)\end{matrix}$where (j−k)_(n) denotes the residue (j−k) mod n, $\begin{matrix}\begin{matrix}{y_{j} = {\sum\limits_{t = {{{- 2}\quad\tau} + 1}}^{{2\quad\tau} + 1}{{r(t)}\quad{x\left( {t + {j\quad\tau}} \right)}}}} & \quad & {{0 \leq j \leq {n - 1}},}\end{matrix} & (3) \\{and} & \quad \\\begin{matrix}{b_{k} = {\sum\limits_{t = {{{- 2}\quad\tau} + 1}}^{{2\quad\tau} + 1}{{r(t)}\quad{{r\left( {t + {k\quad\tau}} \right)}.}}}} & \quad & {0 \leq k \leq {n - 1}}\end{matrix} & (4)\end{matrix}$

y_(j) in (2) is the n-point circular convolution of the compressed datax_(k) with the coefficients b_(k) obtained by equation (4) for 0≦k≦n−1.It was shown in [1] that using the FCSI-JPEG, the subjective quality ofthe reconstructed image for τ=2 is better than that of τ=3. Thus, thespecial case τ=2 is considered in this application. It is not difficultto show that the number of real multiplications and real additions thatare needed to compute equation (3) is 9×n and 8×n, respectively. Becauseof the periodicity of the CCI function r(t), from equation (4), oneobtains the coefficients b₀=α, b₁=b_(n−1)=β, b₂=b_(n−2)=γ, b₃=b_(n−3)=δ,b₄=0, b₅=0, . . . , b_(n−4)=0, where α, β, γ and δ are obtained by theuse of equation (8) in [1], and are shown below. These coefficients arethe autocorrelation coefficients between two CSI functions. Now let []^(T) denote the transpose of the column matrix X. Thus y_(j) inequation (2) can be expressed in matrix form as follows:Y=BX  (5)where Y=[y₀, y₁, . . . ,y_(n−1)]^(T), X=[x₀, x₁, . . . , x_(n−1)]^(T)and B=[b₀,b₁, . . . ,b_(n−1)]_(C) denotes the cyclic matrix of size n×n.From [1], one obtains the coefficients α=1.641, β=0.246, γ=−0.07 andδ=0.004. It follows from [1] that the FFT can be used to solve (2) or(5) for x_(k), where 0≦k≦n−1.

Because of a large number of zeros in the coefficients, i.e. b_(k) for0≦k≦n−1, computationally, the FFT algorithm of [1] is very inefficientto perform the circular convolution given in equation (2). In order tospeed up the CSI algorithm for any size of the compressed image, adirect computation, instead of the FFT or WDFF in [2], is developed toperform the n-point circular convolution given in equation (2) forsolving x_(k) for 0≦k≦n−1. In other words, one can compute the n-pointcircular convolution of any size of sampling points in equation (2) bythe means of a direct computation due to the large number of zeros ofthe coefficients, b_(k), where 0≦k≦n−1. To illustrate this, one firstfinds the inverse matrix of B given in equation (5). It is well knownfrom matrix theory that if B in equation (5) is a circular matrix, thenthe inverse matrix, namely A=B⁻¹ is also a circular matrix of size n×n.That is,A=[a ₀ ,a ₁ ,a ₂ ,a ₃ ,a ₄ ,a ₅ ,a ₆ , . . . ,a _(n−6) ,a _(n−5) ,a_(n−4) ,a _(n−3) ,a _(n−2) ,a _(n−1)]_(C)  (6)where a₀=0.646, a₁=a_(n−1)=−0.109, a₂=a_(n−2)=0.0467, a₃=a_(n−3)=−0.014,a₄=a_(n−4)=0.0046, a₅=a_(n−5)=−0.00148, and a₆≅a₇≅a₈≅ . . . ≅a_(n−6)≅0which can be pre-computed. Note that the values of constants a_(i) for6≦i≦n−6 are within 10⁻⁶. It can be shown by computer simulation thatthese constants can be assumed to be zero without degrading the qualityof the reconstructed image. The solution to Y=BX given in equation (5)will then be expressed in matrix form asX=B ⁻¹ Y=AY  (7)where A=B⁻¹ is a circulant matrix given in (6). (7) is thus reduced tothe form $\begin{matrix}{x_{j} = {\sum\limits_{k = 0}^{n - 1}{y_{k}\quad a_{{({j - k})}_{n}}}}} & (8)\end{matrix}$

For τ=2, since the coefficients a_(i) for 0≦i≦n−1 given in equation (6)are periodic with period n, one can rearrange these coefficients asa⁻⁵=a₅=−0.0148, a₄=a₄=0.0046, a₃=a⁻³=−0.14, a⁻²=a₂=0.0467,a⁻¹=a₁=−0.109, and a₀=0.646. These coefficients shown in FIG. 2 arecalled the reconstructed filter coefficients. Because of the periodicityof the known data function y_(k)=y_(k+n), the compressed data x_(j) inequation (8) can be obtained by using the following linear correlationequation: $\begin{matrix}\begin{matrix}{x_{j} = {\sum\limits_{k = {- 5}}^{5}{a_{k}\quad y_{j + k}}}} & \quad & {0 \leq j \leq {n - 1}}\end{matrix} & (9)\end{matrix}$where the boundary conditions are y_(−i)=y_(n−i) for 5≦i≦11 andy_(i)=y_(i−n) for n≦i≦n+4. In equation (8), x_(j) can be obtained bycorrelating the known y_(j) in equation (3) for −5≦j≦n+4 with thereconstructed filter coefficients a_(k) for −5≦k≦5. Computing x_(j) inequation (9) involves n correlation coefficients of only 11 points forτ=2. Hence, the number of real multiplications and additions of thisdirect filter computation needed to implement x_(j) in equation (9) are11×n and 10×n, respectively. It is easy to see that the pipelinearchitecture can be developed to directly compute a linear correlation.Its modularity makes it well suitable for hardware or firmwareimplementation.

For the CSI algorithm implemented by the FFT algorithm, if n is a powerof two, in this case, n₁=n. Otherwise, the data y_(j) is expanded from npixels to n₁=2^(l) pixels by appending zeros to the edge of this data,where l is the smallest integer such that 2^(l)>n. For a given n, thenumber of real multiplications and additions needed to implement x_(j)in equation (9) is 4(2n₁ log n₁+n₁) and 2(2n₁ log n₁+n₁), respectively.

For the CSI algorithm implemented by the 9-point WDFT algorithm, if n isdivided by 7, then n=q·7+r, where q and r are the quotient and remainderof n, respectively. The number of the coefficients y_(j), namely n isdivided into q+a overlapping 9-point sub-functions, where a=1 if r−2≧1and a=0, if r−2≦0. Each 9-point sub-function is transformed by thedirect use of the 9-point WDFT. It follows from D. P. Kolba and T. W.Parks, “A prime factor FFT algorithm using high-speed convolution,” IEEETrans. Acoust., Speech, Signal Processing, vol. ASSP-25, no. 4, pp.281-294, August 1977[4], the entire content of which is hereby expresslyincorporated by reference, that the computation of the 9-point WDFTrequires 8 real multiplies, 49 real adders, and 2 shifts. Thus, thenumber of real multiplications and additions used to perform x_(j) in(9) is (q+a)(2×8+9) and (q+a)(2×49+9), respectively.

Since the decimation requires the computation of y_(j) in equation (3)and x_(j) in equation (9), the total numbers of real multiplies neededto compute the decimation using the FFT, the WDFT, and the directalgorithms are 9×n+4(2n log n+n), 9×n+(q+a)(2×8+9), and 9×n+11×n,respectively. However, the total number of real adders needed to computethe decimation using the FFT, the WDFT, and the direct algorithms are8×n+2(2n log n+n), 8×n+(q+a)(2×49+9), and 8×n+10×n, respectively. Thecomplexities of computing the decimation at different pixels using thethree algorithms given above are summarized in Table I.

These three algorithms are compared in this table giving the number ofreal multiplications and additions needed to perform the decimation.From this table, one observes that the FFT algorithm requiredsubstantially much more real multiplications and additions than any ofthe other two methods. The comparison of the direct method and WDFTalgorithm in Table I shows that the direct method requires about 1.59times more real multiplications and 1.30 times less real additions thanthat of the WDFF algorithm, respectively. However, as mentioned earlier,a pipeline convolution algorithm can be developed to compute the linearconvolution defined in equation (9). As a consequence, the direct methodcan be easier to implement in hardware or firmware than the FFT and WDFTalgorithms.

Let x(t₁,t₂) be a doubly periodic image signal of periods n₁τ and n₂τwith respect to the two integer variables t₁ and t₂, where n₁ and n₂ arealso integers. The 2-D cubic spline function r(t₁,t₂) is defined byr(t₁,t₂)=r(t₁)·r(t₂), where the 1-D CCI r(t_(i)) is given in (1) and isalso assumed to be a periodic function of period n_(i)τ for i=1, 2.Finally, let ψ_(k) ₁ _(,k) ₂(t₁,t₂)=r(t₁−k₁τ,t₂−k₂τ)=r(t₁−k₁τ)·r(t₂−k₂τ)=ψ_(k) ₁ (t₁)·ψ_(k) ₂ (t₂)for 0≦k_(i)≦n_(i)−1, where 1≦i≦2. Again, it is shown in [1] that the setof equations as 2-D circular convolution is $\begin{matrix}\begin{matrix}{y_{j_{1},j_{2}} = {\sum\limits_{k_{1} = 0}^{n_{1} - 1}{\sum\limits_{k_{2} = 0}^{n_{2} - 1}{x_{k_{1},k_{2}}b_{{({j_{1} - k_{1}})}_{n_{1}},{({j_{2} - k_{2}})}_{n_{2}}}}}}} \\{{{{{for}\quad 0} \leq j_{i} \leq {n_{i} - {1\quad{and}\quad i}}} = 1},2}\end{matrix} & (10)\end{matrix}$where, for 0≦j_(i)≦n_(i)−1 and i=1, 2, x_(k) _(i) _(,k) ₂ denotes thecompressed values at the sampling points, $\begin{matrix}{y_{j_{1},j_{2}} = {\sum\limits_{n_{1} = {{{- 2}\quad\tau} + 1}}^{{2\quad\tau} - 1}{\sum\limits_{n_{2} = {{{- 2}\quad\tau} + 1}}^{{2\quad\tau} - 1}{{x\left( {{m_{1} + {j_{1}\tau}},{m_{2} + {j_{2}\tau}}} \right)}{r\left( {m_{1},m_{2}} \right)}}}}} & (11)\end{matrix}$

-   -   for 0≦j_(i)≦n_(i)−1 and i=1,2 and $\begin{matrix}        \begin{matrix}        {b_{{({j_{1} - k_{1}})}_{n_{1}},{({j_{2} - k_{2}})}_{n2}} = {\sum\limits_{m_{1} = {{{- 2}\quad\tau} + 1}}^{{2\quad\tau} - 1}{\sum\limits_{m_{2} = {{{- 2}\quad\tau} + 1}}^{{2\quad\tau} - 1}{r\left( {{m_{1} + {\left( {j_{1} - k_{1}} \right)\tau}},{m_{2} +}} \right.}}}} \\        {\left. {\left( {j_{2} - k_{2}} \right)\tau} \right){r\left( {m_{1},m_{2}} \right)}}        \end{matrix} & (12)        \end{matrix}$

Recall that r(m₁,m₂)=r(m₁)r(m₂). Then equation (11) becomes$\begin{matrix}{\begin{matrix}{y_{j_{1},j_{2}} = {\sum\limits_{m_{2} = {{{- 2}\quad\tau} - 1}}^{{2\quad\tau} - 1}\left( {\sum\limits_{m_{1} = {{{- 2}\quad\tau} - 1}}^{{2\quad\tau} - 1}{x\left( {{m_{1} + {j_{1}\tau}},} \right.}} \right.}} \\{\left. {\left. {m_{2} + {j_{2}\tau_{2}}} \right){r\left( m_{1} \right)}} \right){r\left( m_{2} \right)}} \\{= {\sum\limits_{m_{2} = {{{- 2}\quad\tau} + 1}}^{{2\quad\tau} - 1}{{y\left( {{j_{1}\tau},{m_{2} + {j_{2}\tau}}} \right)}{r\left( m_{2} \right)}}}}\end{matrix}{where}{{y\left( {{j_{1}\tau},{m_{2} + {j_{2}\tau}}} \right)} = {\sum\limits_{m_{1} = {{{- 2}\quad\tau} + 1}}^{{2\quad\tau} - 1}{{x\left( {{m_{1} + {j_{1}\tau}},{m_{2} + {j_{2}\tau}}} \right)}{{r\left( m_{1} \right)}.}}}}} & (13)\end{matrix}$

To obtain in equation (13), we first convolve the 1-D CCI function r(t)with each row of the data matrix x(t₁,t₂). This resulting data matrix isthen convolved by column with the same CCI function r(t).

In equation (10), the 1-D convolution can be used to solve for x_(k) ₁_(,k) ₂ . To illustrate this, again sincer(t₁−k₁τ,t₂−k₂τ)=r(t₁−k₂τ)r(t₂−k₂τ) and r(m₁,m₂)=r(m₁)r(m₂), thenequation (12) becomes $\begin{matrix}\begin{matrix}{b_{{({j_{1} - k_{1}})}_{m_{1}},{({j_{2} - k_{2}})}_{n_{2}}} = {\sum\limits_{m_{1} = {{{- 2}\quad\tau} + 1}}^{{2\quad\tau} - 1}{\sum\limits_{m_{2} = {{{- 2}\quad\tau} + 1}}^{{2\quad\tau} - 1}{{r\left( {m_{1} + {\left( {j_{1} - k_{1}} \right)\tau}} \right)}r}}}} \\{\left( {m_{2} + {\left( {j_{2} - k_{2}} \right)\tau}} \right){r\left( m_{1} \right)}{r\left( m_{2} \right)}} \\{= {b_{{({j_{1} - k_{1}})}_{h_{1}}} \cdot b_{{({j_{2} - k_{2}})}_{n_{2}}}}}\end{matrix} & (14)\end{matrix}$where$b_{{({j_{1} - k_{1}})}_{ni}} = {\sum\limits_{m = {{{- 2}\quad\tau} + 1}}^{{2\quad\tau} - 1}{{r\left( {m_{i} + {\left( {j_{i} - k_{i}} \right)\tau}} \right)}{r\left( m_{i} \right)}}}$)τ)r(m_(i)) for i=1, 2. The substitution of equation (14) into equation(10) yields the following 2-D circular convolution: $\begin{matrix}{y_{j_{1},j_{2}} = {\sum\limits_{k_{1} = 0}^{n_{1} - 1}{\sum\limits_{k_{2} = 0}^{n_{2} - 1}{x_{k_{1},k_{2}}{b_{{({j_{1} - k_{1}})}_{n_{1}}} \cdot b_{{({j_{2} - k_{2}})}_{n_{2}}}}}}}} & (15)\end{matrix}$

Equation (15) can be decomposed into two n₁- and n₂-point cyclicconvolutions as follows: $\begin{matrix}{{z_{j_{1},k_{2}} = {\sum\limits_{k_{1} = 0}^{n_{1} - 1}{x_{k_{1},k_{2}}b_{{({j_{1} - k_{1}})}_{n_{1}}}}}},} & (16) \\{y_{j_{1},j_{2}} = {\sum\limits_{k_{2} = 0}^{n_{2} - 1}{z_{j_{1},k_{2}}b_{{({j_{2} - k_{2}})}_{n_{2}}}}}} & (17)\end{matrix}$where the filter coefficients b₀, b₁, . . . , b_(n) _(i) ⁻¹ for i=1, 2are given in [1].

To obtain y_(j) ₁ _(,j) ₂ in equation (16) one first convolves each nowof the data matrix x_(k) ₁ _(,k) ₂ with the coefficients b₀, b₁, . . . ,b_(n) ₁ ⁻¹. The resulting data matrix is then convolved with thefunction coefficients b₀, b₁, . . . , b_(n) ₂ ⁻¹. In order to find x_(k)₁ _(,k) ₂ in equation (17) from the known data y_(j) ₁ _(,j) ₂ and thecoefficients b₀, b₁, . . . , b_(n) _(i) ⁻¹ for i=1, 2, equations (16)and (17) can be expressed explicitly in matrix from as follow:z _(j) ₁ _(,k) ₂ =[z _(0,k) ₂ , . . . , z _(n) _(i) _(−1,k) ₂ ]^(T) =B ₁[x _(0,k) ₂ ,x _(1,k) ₂ , . . . x _(n) ₁ _(−1,k) ₂ ]^(T) for 0≦k ₂ ≦n₂−1  (18)y _(j) ₁ _(,j) ₂ =[y _(j) ₁ _(,0) ,y _(j) _(i) _(,1), . . . , y_(j) ₁_(,n) ₂ ⁻¹]^(T) =B ₂ [z _(j) ₁ _(,0) ,z _(j) ₁ _(,1) , . . . ,z _(j)_(i) _(,n) ₂ ⁻¹]^(T) for 0≦j ₁ ≦n ₁−1  (19)where the matrices B_(i)=[b₀,b₁,b₂, . . . ,b_(n) _(i) ⁻¹]^(C) for i=1,2. In equation (15), the direct computation can be used to solve forx_(k) ₁ _(,k) ₂ . To see this, multiplying both sides of equation (19)by the inverse matrix B₂ ⁻¹, yieldsz _(j) ₁ _(,k) ₂ =[z _(j) ₁ _(,0),z_(j) _(i) _(,1), . . . , z_(j) ₁_(,n) ₂ ⁻¹ ]=B ₂ ⁻¹ [y _(j) ₁ _(,0) ,y _(j) ₁ _(,1) , . . . ,y _(j) _(i)_(,n) ₂ ⁻¹]^(T) for 0≦j ₁ ≦n ₁−1  (20)

Similarly, multiplying both sides of equation (18) by the inverse matrixB₁ ⁻¹, one obtainsx _(k) ₁ _(,k) ₂ =[x _(0,k) ₂ ,x _(1,k) ₂ , . . . ,x _(n) ₁ _(−1,k) ₂]=B ₁ ⁻¹ [z _(0,k) ₂ ,z _(1,k) ₂ , . . . , z _(n) _(i) _(−1,k) ₂ ]^(T)for 0≦k₂ ≦n ₂−1  (21)

It follows from equations (20) and (21) that, to solve x_(k) ₁ _(,k) ₂in equation (14), the reconstructed filter coefficients of FIG. 2 a ₁,a₂, . . . , a_(n) ₂ ⁻¹ are convolved with each column of the compressedmatrix Y_(j) ₁ _(,j) ₂ to obtain z_(j) ₁ _(,k) ₂ then, the row of thisresulting data matrix z_(j) ₁ _(,k) ₂ is convolved with the filtercoefficients a₁, a₂, . . . , a_(n) ₂ ⁻¹ to obtain the reconstructedimage of x_(k) ₁ _(,k) ₂.

The new encoding method for the 2-D image data for τ=2 is summarized inthe following three steps:

-   -   1. Apply equation (13) with the 1-D CCI function given in (1) to        the original image to find all of the coefficients y_(j) ₁ _(,j)        ₂. In other words convolve the 1-D CCI function with each        coordinate of the data matrix x(m₁, m₂) to obtain the        coefficients y_(j) ₁ _(,j) ₂.    -   2. From the known b_(j), the circular matrix in equation (5) can        be constructed. Then, compute the inverse matrix of B, namely A.        Finally, the compressed filter coefficients a_(j) for −5≦j≦5 can        be found by the use of the matrix in equation (5).    -   3. Apply equation (20) and equation (21) with the compressed        filter coefficients a_(j) to obtained the reconstructed data        x_(k) ₁ _(,k) ₂ . In other words convolve the filter        coefficients a_(j) for −5≦j≦5 with each coordinate of the matrix        y_(j) ₁ _(,j) ₂ to obtain the reconstructed image of x_(k) ₁        _(,k) ₂        Decoding Algorithm of the Compressed 2-D Signal

In the decoding process, using the reconstructed values as the samplingpoints (e.g., the x_(k) ₁ _(,k) ₂ data), the reconstructed pointsbetween the sampling points are obtained by means of the CCI functiongiven in equation (1). To do this, since the transformed image datax_(k) ₁ _(,k) ₂ for 0≦k_(i)≦n_(i)−1 and i=1,2 are known, the 2-Dreconstructed image s(t₁,t₂) is the 2-D convolution of the 2-D CCIfunction r(t₁,t₂)=r(t₁)·r(t₂) and the 2-D sampled waveform x_(k) ₁ _(,k)₂. Since r(t₁−k₁τ,t₂−k₂τ)=r(t₁−k₁τ)·r(t₂−k₂τ), then, for τ=2 in equation(12) in [1] becomes $\begin{matrix}{{s\left( {t_{1},t_{2}} \right)} = {\sum\limits_{k_{1} = 0}^{n_{1} - 1}{\sum\limits_{k_{2} = 0}^{n_{2} - 1}{x_{k_{1},k_{2}}{r\left( {t_{1} - {2k_{1}}} \right)}{r\left( {t_{2} - {2k_{2}}} \right)}}}}} & (22)\end{matrix}$

Equation (22) can be decomposed into two n₁- and n₂-point convolutionsas follows: $\begin{matrix}{{{s\left( {t_{1},k_{2}} \right)} = {{\sum\limits_{k_{1} = 0}^{n_{1} - 1}{x_{k_{1},k_{2}}{r\left( {t_{1} - {2k_{1}}} \right)}\quad{for}\quad 0}} \leq k_{2} \leq {n_{2} - 1}}}{{{and}\quad 0} \leq t_{1} \leq {2n_{1}\quad{and}}}} & (23) \\{{s\left( {t_{1},t_{2}} \right)} = {{\sum\limits_{k_{2} = 0}^{n_{2} - 1}{{s\left( {t_{1},k_{2}} \right)}{r\left( {t_{2} - {2k_{2}}} \right)}\quad{for}\quad 0}} \leq t_{2} \leq {2n_{2}}}} & (24)\end{matrix}$

Thus, from equations (23) and (24) the discrete data of each row can beinterpolate from the transformed and compressed image data x_(k) ₁ _(,k)₂ with a similar interpolation for the given discrete data of eachcolumn. The above method is a bilinear interpolation described in W. K.Pratt, Digital Images Processing, 2^(nd) ed., New York: Wiley, 1991[5],the relevant contents of which are hereby expressly incorporated byreference.

Fast JPEG Encoder and Decoder

According to one embodiment of the present invention, a new JPEGencoder-decoder model includes the 1/4 new CSI scheme for preprocessingand the 1/4 CSI for the post-processing needed in the standard JPEGalgorithm as shown in FIG. 3. In this model, an original image in RGBcolor space is converted to another preliminary image in YUV color space(see, for example, [5]) prior to the 1/4 CSI process.

There are two steps in the encoder. The first step is the pre-processingwhich uses the 1/4 new CSI scheme for the Y, U, and V images,individually. At the end of the 1/4 CSI computation, these separate Y,U, and V images are combined to yield one YUV image. The second step forpost-processing uses the 1/4 CSI for the Y, U, and V images. Finally,the Y, U, and V images are combined to yield the YUV format. Then, thisYUV images is converted to the final reconstructed RGB image.

Let x_(k) ₁ _(,k) ₂ and s_(k) ₁ _(,k) ₂ be the original andreconstructed images, respectively, and let k₁, k₂ for 0≦k₁≦M−1 and0≦k₂≦N−1 be the index numbers that determine the vertical and horizontalpositions of objects in the images. The mean square error (MSE) and thePSNR of the 2-D signal are given in [1], respectively.

Experimental results for the 2-D signal image are compared using the CSIscheme of [1], the fast CSI schemes given in [2], and the new CSI schemegiven in this application. These results are computed and are shown inTable II. The PSNR of the 2-D signal are calculated for the standardimages of size 512×512. That is, the original image is decimated by thedisclosed CSI scheme to obtain data samples with a compression ratio of4:1. In addition, the reconstructed values between the sampling pointsare interpolated by the 1/4 CCI in equation (1) to obtain thereconstructed image. It is seen from this table that the new CSI schemehas the same PSNR as the CSI and the FCSI schemes.

Table III lists the PSNR values of the RGB color Lena reconstructedimage of size 512 by 512 at different compression ratios for theCSI-JPEG of [1], the FCSI-JPEG of [2], and the new CSI-JPEG codecdisclosed in this application. From this table, one observes that forthe same compression ratios, the PSNR of the image of the new CSI-JPEGcodec are similar to the two other schemes.

FIG. 4 shows the reconstructed image of Lena at the same compressionratio of 100:1, using the conventional CSI-JPEG, FCSI-JPEG, and the newCSI-JPEG codec. Clearly, the Lena image using our disclosed methodindicates a subjective quality of reconstructed image similar to any ofthe other two methods. The image of FIG. 4 is of size 512 by 512 with acompression ratio 100:1. FIG. 4(a) is an original Lena image, FIG. 4(b)is an image reconstructed by the original CSI-JPEG with the PSNR of Yimage equivalent to 35.08 dB, FIG. 4(c) shows an image reconstructed bythe FCSI-JPEG with PSNR of Y image equivalent to 35.07 dB, and FIG. 4(d)depicts an image reconstructed by the new CSI-JPEG with PSNR of Y imageequivalent to 34.51 dB.

Finally, the new CSI-JPEG codec was implemented on a 800-MHz IntelPentium III personal computer using a C program. The computation time ofthis new simplified algorithm is given in Table IV. It follows fromTable IV that the new CSI-JPEG encoder requires 1.28 sec compared with1.11 sec for the FCSI-JPEG encoder and 1.52 sec for the CSI-JPEGencoder, respectively. Although, the new CSI-JPEG decoder requiresalmost the same computation time as the other two decoders, the newCSI-JPEG encoder requires 0.17 sec more time and 0.24 sec less time thanthe FCSI-JPEG decoder and the CSI-JPEG decoder, respectively.

It will be recognized by those skilled in the art that variousmodifications may be made to the illustrated and other embodiments ofthe invention described above, without departing from the broadinventive scope thereof. It will be understood therefore that theinvention is not limited to the particular embodiments or arrangementsdisclosed, but is rather intended to cover any changes, adaptations ormodifications which are within the scope and spirit of the invention asdefined by the appended claims. TABLE 1 COMPLEXITY OF COMPUTINGDECIMATION USING FFT, WDFT AND DIRECT METHODS FFT Algorithm WDFTAlgorithm Direct Algorithm No. Real No. Real No. Real No. Real No. RealNo. Real n Mult. Add. Mult. Add. Mult. Add. 640 22688 13584 8110 1517812800 11520 480 16536 9948 6020 11116 9600 8640 320 10573 6407 4130 79106400 5760 240 7690 4685 3060 5771 4800 4320 352 11747 7106 4468 83807040 6336 288 9410 5713 3592 6798 5760 5184 512 17753 10668 6458 1201410240 9216

TABLE II PSNR (dB) OF 2-D IMAGE OF SIZE 512 BY 512 WITH COMPRESSIONRATIO 4:1 FOR τ = 2 Image CSI [1] FCSI [2] NEW CSI Peppers 33.21 33.2032.89 Lake 30.87 30.86 30.51 Couple 30.51 30.36 30.23 Crowd 33.86 33.8633.42 Lena 35.08 35.07 34.51

TABLE III PSNR VALUE AT DIFFERENT COMPRESSION RATIOS OF 2D IMAGE OF SIZE512 BY 512 for CSI-JPEG, FCST-JPEG, AND NEW CSI-JPEG CODE PSNR (dB)Compress CSI-JPEG FCSI-JPEG NEW CSI-JPEG Ratio Y U V Y U V Y U V 125:130.27 34.89 35.00 30.37 34.98 35.09 29.89 33.55 34.56 100:1 31.20 35.6635.85 31.15 35.64 35.67 30.92 35.42 35.66  75:1 32.15 36.45 36.42 32.1436.41 36.33 32.02 36.10 36.37  50:1 33.36 37.42 37.32 33.51 37.51 37.2433.28 37.32 37.28

TABLE IV COMPUTATION TIME (SECONDS) OF THE COLOR IMAGE OF SIZE 640 BY480 AT THE COMPRESSION RATIO OF 100:1 IMPLEMENTED ON A 800 MHz INTELPENTIUM III PERSONAAL COMPUTER. Encoder Decoder JPEG JPEGB DecimationEncoder Total Decoder Interpolation Total CSI-JPEG 0.68 0.84 1.52 0.230.1 0.33 FCSI-JPEG 0.27 0.84 1.11 0.23 0.08 0.31 nCSI-JPEG 0.44 0.841.28 0.23 0.1 0.33

1. A method performed by a computer for encoding a signal comprising:defining a 1-D cubic-spline filter by $\begin{matrix}{{R(t)} = \left\{ {\begin{matrix}{{{\left( {3/2} \right){t}^{3}} - {\left( {5/2} \right){t}^{2}} + 1},} & {0 \leq {t} < 1} \\{{{{- \left( {1/2} \right)}{t}^{3}} + {\left( {5/2} \right){t}^{2}} - {4{t}} + 2},} & {{1 \leq {t} < 2};} \\{0,} & {2 \leq {t}}\end{matrix};} \right.} & (1)\end{matrix}$  applying the filter to an input signal x(t) with$\begin{matrix}\begin{matrix}{{y_{j} = {\sum\limits_{t = {{{- 2}\quad\tau} + 1}}^{{2\quad\tau} + 1}{{r(t)}\quad{x\left( {t + {j\quad\tau}} \right)}}}},} & \quad & {{0 \leq j \leq {n - 1}},}\end{matrix} & (3)\end{matrix}$ to compute y_(j); computing B=[b₀,b₁, . . . ,b_(n−1)]_(C),where B denotes acyclic matrix of size n×n, where $\begin{matrix}\begin{matrix}{b_{k} = {\sum\limits_{t = {{{- 2}\quad\tau} + 1}}^{{2\quad\tau} + 1}{{r(t)}\quad{{r\left( {t + {k\quad\tau}} \right)}.}}}} & \quad & \quad & {0 \leq k \leq {n - 1}}\end{matrix} & (4)\end{matrix}$  and where b₀=α=1.641, b₁=b_(n−1)=0.246,b₂=b_(n−2)=γ=−0.07, b₃=b_(n−3)=δ=0.004, b₄₌₀, b₅₌₀, . . . , b_(n−4)=0;computing A=B⁻¹, where A is a circular matrix of size n×n, whereA=[a ₀ ,a ₁ ,a ₂ ,a ₃ ,a ₄ ,a ₅ ,a ₆ , . . . ,a _(n−6) ,a _(n−5) ,a_(n−4) ,a _(n−3) ,a _(n−2) ,a _(n−1)]_(C)  (6)  and where a₀=0.646,a₁=a_(n−1)=−0.109, a₂=a_(n−2)=0.0467, a₃=a_(n−3)=−0.014,a₄=a_(n−4)=0.0046, a₅=a_(n−5)=−0.00148, and  a₆≅a₇≅a₈≅ . . . ≅a_(n−6)≅0;and computingX=B ⁻¹ Y=AY  (7)  by computing $\begin{matrix}{x_{j} = {\sum\limits_{k = 0}^{n - 1}{y_{k}\quad{a_{{({j - k})}_{n}}.}}}} & (8)\end{matrix}$
 2. A JPEG decoder comprising: means for defining a 1-Dcubic-spline filter by $\begin{matrix}{{R(t)} = \left\{ {\begin{matrix}{{{\left( {3/2} \right){t}^{3}} - {\left( {5/2} \right){t}^{2}} + 1},} & {0 \leq {t} < 1} \\{{{{- \left( {1/2} \right)}{t}^{3}} + {\left( {5/2} \right){t}^{2}} - {4{t}} + 2},} & {1 \leq {t} < 2} \\{0,} & {2 \leq {t}}\end{matrix};} \right.} & (1)\end{matrix}$  means for applying the filter to an input signal x(t)with $\begin{matrix}\begin{matrix}{{y_{j} = {\sum\limits_{t = {{{- 2}\quad\tau} + 1}}^{{2\quad\tau} + 1}{{r(t)}\quad{x\left( {t + {j\quad\tau}} \right)}}}},} & \quad & {{0 \leq j \leq {n - 1}},}\end{matrix} & (3)\end{matrix}$ to compute y_(j); means for computing B=[b₀,b₁, . . .,b_(n−1)]_(C), where B denotes a cyclic matrix of size n×n, where$\begin{matrix}\begin{matrix}{b_{k} = {\sum\limits_{t = {{{- 2}\quad\tau} + 1}}^{{2\quad\tau} + 1}{{r(t)}\quad{{r\left( {t + {k\quad\tau}} \right)}.}}}} & \quad & \quad & {0 \leq k \leq {n - 1}}\end{matrix} & (4)\end{matrix}$  and where b₀=α=1.641, b₁=b_(n−1)=β=0.246,b₂=b_(n−2)=γ=−0.07, b₃=b_(n−3)=δ=0.004, b₄=0, b₅=0, . . . , b_(n−4)=0;means for computing A=B⁻¹, where A is a circular matrix of size n×n,whereA=[a ₀ ,a ₁ ,a ₂ ,a ₃ ,a ₄ ,a ₅ ,a ₆ , . . . ,a _(n−6) ,a _(n−5) ,a_(n−4) ,a _(n−3) ,a _(n−2) ,a _(n−1)]_(C)  (6)  and where a₀=0.646,a₁=a_(n−1)=−0.109, a₂=a_(n−2)=0.0467, a₃=a_(n−3)=−0.014,a₄=a_(n−4)=0.0046, a₅=a_(n−5)=−0.00148, and  a₆≅a₇≅a₈≅ . . . ≅a_(n−6)≅0;and means for computingX=B ⁻¹ Y=AY  (7)  by computing $\begin{matrix}{x_{j} = {\sum\limits_{k = 0}^{n - 1}{y_{k}\quad{a_{{({j - k})}_{n}}.}}}} & (8)\end{matrix}$
 3. A digital signal processor (DSP) having stored thereona set of instructions including instruction for processing a signal, theinstruction when executed by the DSP perform the steps of: defining a1-D cubic-spline filter by $\begin{matrix}{{R(t)} = \left\{ {\begin{matrix}{{{\left( {3/2} \right){t}^{3}} - {\left( {5/2} \right){t}^{2}} + 1},} & {0 \leq {t} < 1} \\{{{{- \left( {1/2} \right)}{t}^{3}} + {\left( {5/2} \right){t}^{2}} - {4{t}} + 2},} & {1 \leq {t} < 2} \\{0,} & {2 \leq {t}}\end{matrix};} \right.} & (1)\end{matrix}$  applying the filter to an input signal x(t) with$\begin{matrix}\begin{matrix}{{y_{j} = {\sum\limits_{t = {{{- 2}\quad\tau} + 1}}^{{2\quad\tau} + 1}{{r(t)}\quad{x\left( {t + {j\quad\tau}} \right)}}}},} & \quad & {{0 \leq j \leq {n - 1}},}\end{matrix} & (3)\end{matrix}$ to compute y_(j); computing B=[b₀,b₁, . . . ,b_(n−1)]_(C),where B denotes a cyclic matrix of size n×n, where $\begin{matrix}\begin{matrix}{b_{k} = {\sum\limits_{t = {{{- 2}\quad\tau} + 1}}^{{2\quad\tau} + 1}{{r(t)}\quad{{r\left( {t + {k\quad\tau}} \right)}.}}}} & \quad & \quad & {0 \leq k \leq {n - 1}}\end{matrix} & (4)\end{matrix}$  and where b₀=α=1.641, b₁=b_(n−1)=β=0.246,b₂=b_(n−2)=γ=0.07, b₃=b_(n−3)=δ=0.004, b₄=0, b₅=0, . . . , b_(n−4)=0;computing A=B⁻¹, where A is a circular matrix of size n×n, whereA=[a ₀ ,a ₁ ,a ₂ ,a ₃ ,a ₄ ,a ₅ ,a ₆ , . . . ,a _(n−6) ,a _(n−5) ,a_(n−4) ,a _(n−3) ,a _(n−2) ,a _(n−1)]_(C)  (6)  and where a₀=0.646,a₁=a_(n−1)=−0.109, a₂=a_(n−2)=0.0467, a₃=a_(n−3)=−0.014,a₄=a_(n−4)=0.0046, a₅=a_(n−5)=−0.00148, and  a₆≅a₇≅a₈≅ . . . ≅a_(n−6)≅0;and computingX=B ⁻¹ Y=AY  (7)  by computing $\begin{matrix}{x_{j} = {\sum\limits_{k = 0}^{n - 1}{y_{k}{a_{{({j - k})}_{n}}.}}}} & (8)\end{matrix}$